Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add async logrus hook for alert #36

Merged
merged 14 commits into from
May 22, 2024
Merged

Add async logrus hook for alert #36

merged 14 commits into from
May 22, 2024

Conversation

wanliqun
Copy link
Contributor

@wanliqun wanliqun commented May 8, 2024

This change is Reviewable

@wanliqun wanliqun requested a review from boqiu May 8, 2024 03:33
Copy link
Contributor

@boqiu boqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: 3 of 4 files reviewed, 13 unresolved discussions (waiting on @wanliqun)


log/hook/async.go line 13 at r1 (raw file):

)

type runningStatus uint32

status?

Moreover, it is overdesign to use enum for bool purpose.

Code quote:

runningStatus

log/hook/async.go line 30 at r1 (raw file):

	IdlePoolWorkers uint32 `default:"4"`
	// The max number of goroutines that can be running concurrently while busy.
	MaxConcurrentWorkers uint32 `default:"16"`

Suggest 1 thread by default to avoid rate limited by 3rd party alert service.


log/hook/async.go line 36 at r1 (raw file):

// AsyncWrapper is an async wrapper that uses goroutines to invoke the wrapped hook.
type AsyncWrapper struct {

I think AsyncHook is a better name.

Code quote:

AsyncWrapper

log/hook/async.go line 41 at r1 (raw file):

	// The flag to indicate running status: 0 - stopped, 1 - running.
	runningFlag uint32

Why not use sync.Bool instead?


log/hook/async.go line 54 at r1 (raw file):

}

func NewAsyncWrapper(wrappedHook logrus.Hook, opts AsyncOption) *AsyncWrapper {

hook is the right name, since it is not be wrapped yet.

Code quote:

wrappedHook

log/hook/async.go line 64 at r1 (raw file):

func (w *AsyncWrapper) Fire(entry *logrus.Entry) error {
	if !w.isRunning() {
		return errNotRunning

Why not starting the AsyncHook worker when instance created?


log/hook/async.go line 71 at r1 (raw file):

		return nil
	case <-w.ctx.Done():
		return w.ctx.Err()

What's the logrus behavior if error returned?

Moreover, it is OK to ignore when the ctx is done, which is triggered when program is terminating.


log/hook/async.go line 80 at r1 (raw file):

func (w *AsyncWrapper) Start(ctx context.Context) {
	if w.isRunning() { // already started?

I think the worker could be started when instance created.


log/hook/async.go line 107 at r1 (raw file):

}

func (w *AsyncWrapper) Stop() {

Unnecessary, since Start method has ctx parameter to indicate termination.


log/hook/async.go line 120 at r1 (raw file):

// boost starts extra goroutine to help empty out the job queue.
func (w *AsyncWrapper) boost(entry *logrus.Entry) error {

It seems the design is complex because of boost mechanism. Why not only limit the maximum concurrent number of goroutines?


log/hook/async.go line 149 at r1 (raw file):

}

func (w *AsyncWrapper) wrapCall(entry *logrus.Entry) {

fire(entry *logrus.Entry)?


log/hook/async.go line 154 at r1 (raw file):

		entryStr, _ := formatter.Format(entry)

		fmt.Printf("Asyn hook call error %v with entry %v\n", err, entryStr)
  1. why not use json format? text format is ok.
  2. how about to output to stderr?

log/hook/hook.go line 29 at r1 (raw file):

	Channels []string
	// async worker options
	Async AsyncOption

Optional to allow non-async mode?

Copy link
Contributor Author

@wanliqun wanliqun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 4 files reviewed, 13 unresolved discussions (waiting on @boqiu)


log/hook/async.go line 13 at r1 (raw file):

Previously, boqiu (Bo QIU) wrote…

status?

Moreover, it is overdesign to use enum for bool purpose.

done


log/hook/async.go line 30 at r1 (raw file):

Previously, boqiu (Bo QIU) wrote…

Suggest 1 thread by default to avoid rate limited by 3rd party alert service.

done


log/hook/async.go line 36 at r1 (raw file):

Previously, boqiu (Bo QIU) wrote…

I think AsyncHook is a better name.

done


log/hook/async.go line 41 at r1 (raw file):

Previously, boqiu (Bo QIU) wrote…

Why not use sync.Bool instead?

removed


log/hook/async.go line 54 at r1 (raw file):

Previously, boqiu (Bo QIU) wrote…

hook is the right name, since it is not be wrapped yet.

done


log/hook/async.go line 64 at r1 (raw file):

Previously, boqiu (Bo QIU) wrote…

Why not starting the AsyncHook worker when instance created?

done


log/hook/async.go line 71 at r1 (raw file):

Previously, boqiu (Bo QIU) wrote…

What's the logrus behavior if error returned?

Moreover, it is OK to ignore when the ctx is done, which is triggered when program is terminating.

Done

By default, Logrus does not do anything with this error; it does not stop the execution of other hooks or the logging statement itself. The error is typically logged to stderr, and the program continues to execute.


log/hook/async.go line 80 at r1 (raw file):

Previously, boqiu (Bo QIU) wrote…

I think the worker could be started when instance created.

done


log/hook/async.go line 107 at r1 (raw file):

Previously, boqiu (Bo QIU) wrote…

Unnecessary, since Start method has ctx parameter to indicate termination.

done


log/hook/async.go line 120 at r1 (raw file):

Previously, boqiu (Bo QIU) wrote…

It seems the design is complex because of boost mechanism. Why not only limit the maximum concurrent number of goroutines?

done


log/hook/async.go line 149 at r1 (raw file):

Previously, boqiu (Bo QIU) wrote…

fire(entry *logrus.Entry)?

done


log/hook/async.go line 154 at r1 (raw file):

Previously, boqiu (Bo QIU) wrote…
  1. why not use json format? text format is ok.
  2. how about to output to stderr?

done


log/hook/hook.go line 29 at r1 (raw file):

Previously, boqiu (Bo QIU) wrote…

Optional to allow non-async mode?

done

@wanliqun wanliqun requested a review from boqiu May 8, 2024 06:48
Copy link
Contributor

@boqiu boqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r3, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @wanliqun)


log/log.go line 45 at r3 (raw file):

	logrus.SetLevel(level)

	err = hook.AddAlertHook(context.Background(), conf.AlertHook)

Better to allow client to specify a context for graceful termination. Otherwise, minor logs may be missed for notification.

BTW, it is important to specify a suitable timeout to send notification to dingtalk or telegram.


log/hook/async.go line 16 at r3 (raw file):

type AsyncOption struct {
	// Enable async mode or not
	Enabled bool

How about to leverage MaxWorkers variable to decide if async hook enabled?

MaxWorkers == 0 indicates async hook disabled.


log/hook/async.go line 18 at r3 (raw file):

	Enabled bool
	// The max number of worker goroutines that can be running concurrently.
	MaxWorkers uint32 `default:"4"`

Better to use 1 by default, so as to send alerts in sequence.


log/hook/async.go line 23 at r3 (raw file):

}

// AsyncHook is an async wrapper that uses goroutines to invoke the wrapped hook.

// AsyncHook allows to handle logrus.Hook asynchronously.


log/hook/async.go line 33 at r3 (raw file):

	jobQueue chan *logrus.Entry
	// The context for life cycle control.
	ctx context.Context

Could be removed and add a ctx parameter for start method.

The less struct fields, the better.


log/hook/async.go line 88 at r3 (raw file):

	if h.incrWorkerCounter() > h.MaxWorkers {
		h.decrWorkerCounter()
		return errors.New("async hook buffer is full")

Is it possible to notify that send queue is full and requires administrator to check logs in time?


log/hook/async.go line 92 at r3 (raw file):

	go func() {
		defer h.decrWorkerCounter()

In my mind, such boost design is not good enough. Instead, I suggest one thread for one channel, and send alerts in sequence. Once channel buffer is not enough, just notify admin to check logs in time (even such message will not be delivered in time).

Copy link
Contributor Author

@wanliqun wanliqun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 5 files reviewed, 7 unresolved discussions (waiting on @boqiu)


log/log.go line 45 at r3 (raw file):

Better to allow client to specify a context for graceful termination. Otherwise, minor logs may be missed for notification.

Done

BTW, it is important to specify a suitable timeout to send notification to dingtalk or telegram.
Good point. it's also an important consideration for both sync and async mode. Besides, due to the dingtalk sdk we've used doesn't support timeout operation, I will submit a separate PR to fix it.


log/hook/async.go line 16 at r3 (raw file):

Previously, boqiu (Bo QIU) wrote…

How about to leverage MaxWorkers variable to decide if async hook enabled?

MaxWorkers == 0 indicates async hook disabled.

Done


log/hook/async.go line 18 at r3 (raw file):

Previously, boqiu (Bo QIU) wrote…

Better to use 1 by default, so as to send alerts in sequence.

Done


log/hook/async.go line 23 at r3 (raw file):

Previously, boqiu (Bo QIU) wrote…

// AsyncHook allows to handle logrus.Hook asynchronously.

Done


log/hook/async.go line 33 at r3 (raw file):

Previously, boqiu (Bo QIU) wrote…

Could be removed and add a ctx parameter for start method.

The less struct fields, the better.

Done


log/hook/async.go line 88 at r3 (raw file):

Previously, boqiu (Bo QIU) wrote…

Is it possible to notify that send queue is full and requires administrator to check logs in time?

I've switched to notify the conjestion status synchronously to the alert channel, which should be enough for current stage.


log/hook/async.go line 92 at r3 (raw file):

Previously, boqiu (Bo QIU) wrote…

In my mind, such boost design is not good enough. Instead, I suggest one thread for one channel, and send alerts in sequence. Once channel buffer is not enough, just notify admin to check logs in time (even such message will not be delivered in time).

Done

@wanliqun wanliqun requested a review from boqiu May 10, 2024 10:01
@wanliqun wanliqun changed the title Add async wrapper for alert hooking Add async logrus hook for alert May 10, 2024
Copy link
Contributor

@boqiu boqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r4, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @wanliqun)


config/config.go line 17 at r4 (raw file):

//
// Note that it will panic and exit if any error happens.
func MustInit(ctx context.Context, wg *sync.WaitGroup, viperEnvPrefix string) {

Suggest to add a new method for graceful shutdown.


log/hook/async.go line 77 at r4 (raw file):

		defer wg.Done()

		var awg sync.WaitGroup

Why not reuse wg?

Copy link
Contributor Author

@wanliqun wanliqun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @boqiu)


log/hook/async.go line 77 at r4 (raw file):

Previously, boqiu (Bo QIU) wrote…

Why not reuse wg?

`
// start launches worker goroutines and manages their lifecycle.
func (h *AsyncHook) start(ctx context.Context, wg *sync.WaitGroup) {
defer h.started.Store(true) // Mark the hook as started.

wg.Add(1)
go func() {
defer wg.Done()

var awg sync.WaitGroup
for i := 0; i < h.NumWorkers; i++ {
  awg.Add(1)
  go func() {
    defer awg.Done()
    h.worker(ctx)
  }()
}

awg.Wait()
h.drainJobQueue()

}()
}I'm afriad we can't as thewgis passed in as a parameter, and we can't guaratee it isn't used somewhere else. Under such case thewg.Wait()` might cause a dead lock.

Copy link
Contributor Author

@wanliqun wanliqun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 6 files reviewed, 2 unresolved discussions (waiting on @boqiu)


config/config.go line 17 at r4 (raw file):

Previously, boqiu (Bo QIU) wrote…

Suggest to add a new method for graceful shutdown.

Done.

@wanliqun wanliqun requested a review from boqiu May 13, 2024 06:06
Copy link
Contributor

@boqiu boqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r5, 2 of 2 files at r6, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @wanliqun)


graceful/handler.go line 1 at r5 (raw file):

package graceful

terminate or shutdown is better.

Code quote:

graceful

Copy link
Contributor Author

@wanliqun wanliqun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 7 files reviewed, 1 unresolved discussion (waiting on @boqiu)


graceful/handler.go line 1 at r5 (raw file):

Previously, boqiu (Bo QIU) wrote…

terminate or shutdown is better.

Done.

Copy link
Contributor

@boqiu boqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r7, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @wanliqun)


terminate/context.go line 1 at r7 (raw file):

package terminiate

Not used?

Copy link
Contributor Author

@wanliqun wanliqun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 6 of 7 files reviewed, 1 unresolved discussion (waiting on @boqiu)


terminate/context.go line 1 at r7 (raw file):

Previously, boqiu (Bo QIU) wrote…

Not used?

Done.

Copy link
Contributor

@boqiu boqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r8, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @wanliqun)

@wanliqun wanliqun merged commit 2eb5218 into main May 22, 2024
2 checks passed
@wanliqun wanliqun deleted the async-alert branch May 22, 2024 02:01
@wanliqun wanliqun restored the async-alert branch May 22, 2024 06:29
@wanliqun wanliqun deleted the async-alert branch May 22, 2024 06:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants